Abstract.

Statistical properties of evolving random graphs are analyzed using kinetic theory. Treating the 
linking process dynamically, structural characteristics of links, paths, cycles, and components are 
obtained analytically using the rate equation approach. Scaling laws for finite systems are derived 
using extreme statistics and scaling arguments. 
INTRODUCTION

Random graphs have been studied in many disciplines including statistical physics, 
chemical physics, combinatorics, probability theory, and computer science [1, 2, 3, 4, 5, 
6]. For example, they are used to model percolation in polymerization processes [7, 8] 
and phase transitions in algorithmic complexity [9].

A random graph is a set of nodes that are joined by random links. When the number 
of links exceeds a threshold, a connected component containing a finite fraction of all 
nodes, the giant component, emerges. Essentially, random graphs are a mean-field model 
of percolation [10, 11]. 

In this short review, we summarize our recent work on random graphs [12, 13]. 
We describe how, by treating the linking process dynamically, random graphs can be 
studied using kinetic theory. Structures such as paths, cycles, and components grow 
via elementary aggregation processes and their distributions can be obtained using 
the rate equation approach [14, 15, 16, 17, 18, 19]. This technique complements the 
combinatorial methods, traditionally used to analyze random graphs [3, 4, 5]. 

THE EVOLVING RANDOM GRAPH 

A graph is a collection of nodes that are joined by links, and in a random graph, the links 
are random. There are different types of graphs. In a static graph, the links are generated 
instantaneously, while in an evolving graph the links are generated sequentially. In a 
simple graph, a given pair of nodes may be connected by a single link only, but in a 
multigraph they may be connected by multiple links. 

We consider the following random graph model. Starting with no links and $N$ disconnected
nodes, links are sequentially added between randomly selected pairs of nodes.
This linking process continues ad infinitum with constant rate, set to $(2N)^{-1}$ per pair of
nodes without loss of generality. The two nodes selected for linking need not be different, and
additionally, the number of links between two nodes is not limited. In other words, we consider
a random evolving multigraph. Additionally, we consider the infinite system size limit,
$N \to \infty$, where statistical fluctuations can usually be ignored.



LINKS, PATHS, AND CYCLES 

At time $t$, the total number of links is on average $Nt/2$, and therefore, the average number
of links per node (the degree) equals time $t$. Let $l$ be the degree of a node. It undergoes
the additive stochastic process $l \to l+1$. The probability $F_l$ that the degree of a node
equals $l$, the degree distribution, satisfies

$$\frac{dF_l}{dt} = F_{l-1} - F_l \tag{1}$$

with the initial condition $F_l(0) = \delta_{l,0}$. Therefore, the degree distribution is Poissonian,

$$F_l = \frac{t^l}{l!}\, e^{-t}, \tag{2}$$

with the mean degree equal to time, $\langle l \rangle = t$.
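The Poisson form of the degree distribution can be checked against a direct simulation of the linking process. The sketch below is illustrative only: the function names are hypothetical, and it assumes that exactly `round(N*t/2)` links have been placed at time `t` (the mean of the underlying random process), with both endpoints drawn uniformly so that self-links and repeated links are allowed.

```python
import math
import random
from collections import Counter

def simulate_degree_distribution(n_nodes, t, seed=0):
    """Place about n_nodes*t/2 random links and return the empirical degree distribution."""
    rng = random.Random(seed)
    degree = [0] * n_nodes
    n_links = round(n_nodes * t / 2)  # assumption: use the mean number of links at time t
    for _ in range(n_links):
        a = rng.randrange(n_nodes)
        b = rng.randrange(n_nodes)    # b may equal a: self-links are allowed
        degree[a] += 1
        degree[b] += 1
    counts = Counter(degree)
    return {l: c / n_nodes for l, c in counts.items()}

def poisson(l, t):
    """Poisson degree distribution F_l = t^l e^{-t} / l!."""
    return t**l * math.exp(-t) / math.factorial(l)

empirical = simulate_degree_distribution(200_000, 0.8)
for l in range(4):
    print(l, round(empirical.get(l, 0.0), 4), round(poisson(l, 0.8), 4))
```

For a system of this size the two columns should agree to within sampling noise of order $N^{-1/2}$.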

A pair of nodes may be connected by a consecutive series of links forming a path.
When a newly added link connects two paths of lengths $n$ and $m$, a longer path is formed:
$n, m \to n+m+1$. Thus, paths undergo an aggregation process. Let $P_l(t)$ be the density
of distinct paths containing $l$ links at time $t$. This density satisfies

$$\frac{dP_l}{dt} = \sum_{n+m=l-1} P_n P_m \tag{3}$$

for $l > 0$ and $P_0(t) = 1$. The initial condition is $P_l(0) = \delta_{l,0}$. Therefore, the path length
density is

$$P_l = t^l. \tag{4}$$

For example, the first quantity $P_1 = t$ reflects that the link density is equal to $t/2$ and
that every link corresponds to two distinct paths of length one. The total density of paths,
$P_{tot} = \sum_l P_l = (1-t)^{-1}$, diverges as the percolation time is approached, $t \to 1$. At this
time, the system develops a giant component that eventually percolates through the entire system.
The divergence of the total number of paths is typical: average quantities as well as
typical characteristics diverge near the transition point. For example, the typical path
length diverges as the percolation time is approached,

$$l \sim (1-t)^{-1}, \tag{5}$$

as seen from the average path length, $\langle l \rangle = \sum_l l P_l / \sum_l P_l = t(1-t)^{-1}$.
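The path rate equation, $dP_l/dt = \sum_{n+m=l-1} P_n P_m$ with $P_0 = 1$, can also be integrated numerically and compared with $P_l = t^l$. The following is a minimal sketch, not the authors' procedure: it uses a hand-written fourth-order Runge-Kutta step, and the function names and step count are arbitrary choices. Because the equation for $P_l$ involves only shorter paths, truncating at a maximal length introduces no error.

```python
def path_rhs(P):
    """Right-hand side of dP_l/dt = sum_{n+m=l-1} P_n P_m; P_0 stays fixed at 1."""
    rhs = [0.0] * len(P)
    for l in range(1, len(P)):
        rhs[l] = sum(P[n] * P[l - 1 - n] for n in range(l))
    return rhs

def integrate_paths(t_end, l_max=6, steps=2000):
    """Integrate the path densities P_0..P_{l_max} from t = 0 to t_end with RK4."""
    P = [1.0] + [0.0] * l_max          # initial condition P_l(0) = delta_{l,0}
    h = t_end / steps
    for _ in range(steps):
        k1 = path_rhs(P)
        k2 = path_rhs([p + 0.5 * h * k for p, k in zip(P, k1)])
        k3 = path_rhs([p + 0.5 * h * k for p, k in zip(P, k2)])
        k4 = path_rhs([p + h * k for p, k in zip(P, k3)])
        P = [p + h * (a + 2 * b + 2 * c + d) / 6
             for p, a, b, c, d in zip(P, k1, k2, k3, k4)]
    return P

P_half = integrate_paths(0.5)
print([round(p, 6) for p in P_half])   # compare with 0.5**l
```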

When two nodes along a path are linked, a cycle forms. Cycles have been studied
extensively [20, 21, 22, 23] and, for example, they are useful for characterizing phase
transitions in algorithmic complexity [9]. Let the average number of cycles of size $l$ at
time $t$ be $Q_l(t)$. It is coupled to the path length density via the rate equation

$$\frac{dQ_l}{dt} = \frac{1}{2} P_{l-1}. \tag{6}$$

The right-hand side equals the link creation rate $1/(2N)$ times the total number of paths
$N P_{l-1}$. As a result, the cycle length distribution is

$$Q_l = \frac{t^l}{2l}. \tag{7}$$

Thus, at the percolation time, the cycle length distribution is inversely proportional to
the length, $Q_l(t=1) = (2l)^{-1}$. In general, size distributions decay exponentially away
from the percolation point and algebraically precisely at the percolation point. The total
number of cycles in the system, $Q_{tot} = \sum_l Q_l$, is $Q_{tot} = \frac{1}{2}\ln\frac{1}{1-t}$. It diverges weakly as
the percolation point is approached. We note that the average number of cycles is not
an extensive quantity and for large systems, it saturates at a finite value. The number
of cycles is therefore a non-self-averaging quantity: it fluctuates from realization to
realization.

What is the probability $S(t)$ that the system contains no cycles at time $t$? Since the
cycle formation process is random, and the cycle production rate is $J = dQ_{tot}/dt =
\frac{1}{2(1-t)}$, then $dS/dt = -SJ$ or alternatively,

$$\frac{dS}{dt} = -\frac{S}{2(1-t)}. \tag{8}$$

Therefore, the probability that the system contains no cycles decays with time as follows

$$S = (1-t)^{1/2}. \tag{9}$$

This survival probability shows that the system is bound to nucleate at least a single
cycle prior to the percolation time.
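The step from $dS/dt = -SJ$ to the survival probability is a separation of variables. Spelled out, assuming $S(0) = 1$ (the system starts with no links and hence no cycles):

```latex
\frac{dS}{S} = -\frac{dt}{2(1-t)}
\;\Longrightarrow\;
\ln S(t) - \ln S(0) = \frac{1}{2}\ln(1-t)
\;\Longrightarrow\;
S(t) = (1-t)^{1/2}, \qquad S(0) = 1 .
```

The survival probability vanishes as $t \to 1$ because the integrated cycle production rate, $Q_{tot} = \frac{1}{2}\ln\frac{1}{1-t}$, diverges there.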

Following similar reasoning, properties of the first cycle may be obtained. The size
distribution of the first cycle, $G_l$, obeys a simple generalization of (6),

$$\frac{dG_l}{dt} = \frac{1}{2}\, S\, P_{l-1}. \tag{10}$$

The rate by which the first cycle is produced is simply the rate by which all cycles are
produced times the probability that there are no cycles in the system. Since the first cycle
must be produced by the percolation time, the final size distribution of the first cycle is
$G_l(1) = \frac{1}{2}\int_0^1 dt\, S P_{l-1}$ and performing the integration gives

$$G_l(1) = \frac{\Gamma(3/2)\,\Gamma(l)}{2\,\Gamma(l+3/2)}. \tag{11}$$

This size distribution has an algebraic tail, $G_l(1) \sim l^{-3/2}$ for $l \gg 1$. The characteristic
exponent for the tail of $G_l$ is larger than the characteristic exponent for the tail of $Q_l$: the
first cycle is created earlier, and thus, it must be smaller.
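The first-cycle distribution is easy to evaluate numerically. The sketch below assumes the closed form $G_l(1) = \Gamma(3/2)\Gamma(l)/[2\Gamma(l+3/2)]$, which follows from $\frac{1}{2}\int_0^1 dt\,(1-t)^{1/2} t^{l-1}$ as a beta integral; the function name is hypothetical. It evaluates the smallest case, the amplitude of the $l^{-3/2}$ tail, and a partial normalization sum.

```python
import math

def first_cycle_pmf(l):
    """G_l(1) = Gamma(3/2) Gamma(l) / (2 Gamma(l + 3/2)), evaluated in log space."""
    log_g = math.lgamma(1.5) + math.lgamma(l) - math.lgamma(l + 1.5)
    return 0.5 * math.exp(log_g)

# Probability that the first cycle is a self-link (l = 1).
g1 = first_cycle_pmf(1)

# Amplitude of the algebraic tail: G_l(1) * l^(3/2) approaches Gamma(3/2)/2 = sqrt(pi)/4.
tail_amplitude = first_cycle_pmf(10**6) * (10**6) ** 1.5

# The distribution is normalized, but the l^(-3/2) tail makes the sum converge slowly.
partial_sum = sum(first_cycle_pmf(l) for l in range(1, 10001))

print(g1, tail_amplitude, partial_sum)
```

The slow convergence of the partial sum reflects the heavy tail: the mass missing beyond $l = L$ decays only as $L^{-1/2}$.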



FINITE COMPONENTS: SIZE DISTRIBUTION 



A component is a set of connected nodes: every pair of nodes in a component is
connected by a path. Components merge due to linking: a link placed between two
distinct components causes the two to join. There are $i \times j$ ways to join disconnected
components of size $i$ and $j$ and hence, components undergo the aggregation process
$(i, j) \to i + j$ with the aggregation rate $ij/(2N)$.

Let $c_k(t)$ be the density of components containing $k$ nodes at time $t$. The component
size distribution obeys the nonlinear rate equation

$$\frac{dc_k}{dt} = \frac{1}{2}\sum_{i+j=k} (i c_i)(j c_j) - k c_k. \tag{12}$$

The initial condition is $c_k(0) = \delta_{k,1}$. The gain term represents mergers between two
components whose sizes sum to $k$ and the loss term accounts for links involving a
node inside a component of size $k$.

The moments of the size distribution, $M_n = \sum_k k^n c_k$, provide a useful probe of the
dynamics. For example, the second moment obeys the closed equation $dM_2/dt = M_2^2$
and with the initial condition $M_2(0) = 1$, the solution is

$$M_2 = (1-t)^{-1} \tag{13}$$

for $t < 1$. The divergence shows that the system undergoes a percolation transition.
In a finite time $t_g = 1$, an infinite component, the giant component, is formed. Past
the percolation point, the giant component contains a finite fraction of the nodes, and
eventually, it grows to engulf the entire system.
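The second-moment result can be checked by integrating the component rate equation, $dc_k/dt = \frac{1}{2}\sum_{i+j=k}(ic_i)(jc_j) - kc_k$, directly. The sketch below is one possible numerical check under stated assumptions: a hand-written RK4 step, a cutoff `k_max` on the component size, and hypothetical function names. The equations for $k \le k_{max}$ are closed, so the cutoff affects only the neglected tail of the moment sums, which is negligible well below the percolation time.

```python
def smoluchowski_rhs(c):
    """dc_k/dt = (1/2) sum_{i+j=k} (i c_i)(j c_j) - k c_k; index 0 is unused."""
    k_max = len(c) - 1
    rhs = [0.0] * (k_max + 1)
    for k in range(1, k_max + 1):
        gain = sum(i * c[i] * (k - i) * c[k - i] for i in range(1, k))
        rhs[k] = 0.5 * gain - k * c[k]
    return rhs

def integrate_components(t_end, k_max=60, steps=200):
    """Integrate c_1..c_{k_max} from the monodisperse state c_k(0) = delta_{k,1}."""
    c = [0.0] * (k_max + 1)
    c[1] = 1.0
    h = t_end / steps
    for _ in range(steps):
        k1 = smoluchowski_rhs(c)
        k2 = smoluchowski_rhs([x + 0.5 * h * d for x, d in zip(c, k1)])
        k3 = smoluchowski_rhs([x + 0.5 * h * d for x, d in zip(c, k2)])
        k4 = smoluchowski_rhs([x + h * d for x, d in zip(c, k3)])
        c = [x + h * (a + 2 * b + 2 * e + d) / 6
             for x, a, b, e, d in zip(c, k1, k2, k3, k4)]
    return c

c_half = integrate_components(0.5)
m1 = sum(k * ck for k, ck in enumerate(c_half))       # mass in finite components
m2 = sum(k * k * ck for k, ck in enumerate(c_half))   # compare with 1/(1 - t) = 2
print(m1, m2)
```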

The component size distribution can be obtained analytically (see footnote 1),

$$c_k(t) = \frac{k^{k-2}}{k!}\, t^{k-1} e^{-kt}. \tag{14}$$

This size distribution decays exponentially away from the percolation point and algebraically
at the percolation point, $c_k(1) \simeq (2\pi)^{-1/2} k^{-5/2}$. Both behaviors follow from
the scaling behavior $c_k(t) \to (1-t)^5\, \Phi_c\big(k(1-t)^2\big)$ with the typical component size

$$k \sim (1-t)^{-2} \tag{15}$$

and the scaling function $\Phi_c(x) = (2\pi)^{-1/2} x^{-5/2} \exp(-x/2)$. The large-size algebraic
decay of the size distribution is reflected by the small-argument behavior of the scaling
function. Hence, the size distribution exhibits dynamical scaling in the vicinity of the
percolation transition. This behavior is generic: size distributions obey dynamical scaling
near the percolation point and, for example, both the path length density (4) and the
cycle length density (7) can be written in a self-similar form.

Footnote 1: A convenient solution method is as follows. The time dependence is "peeled" first,
$c_k = C_k t^{k-1} e^{-kt}$, with the coefficients satisfying $(k-1)C_k = \frac{1}{2}\sum_{i+j=k}(iC_i)(jC_j)$.
The generating function $G(z) = \sum_k k C_k e^{kz}$ satisfies the differential equation
$(1-G)G' = G$ and consequently $G e^{-G} = e^z$. The coefficients are found using the Lagrange
inversion formula [13]. The very same technique can be used to derive the joint
distributions described in the next section.
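The closed-form size distribution and its critical tail can be compared numerically. The sketch below assumes the form $c_k(t) = \frac{k^{k-2}}{k!} t^{k-1} e^{-kt}$ and evaluates it through logarithms to avoid overflow; the function names are hypothetical.

```python
import math

def component_density(k, t):
    """c_k(t) = k^(k-2)/k! * t^(k-1) * exp(-k t), computed via logarithms (t > 0)."""
    log_c = ((k - 2) * math.log(k) - math.lgamma(k + 1)
             + (k - 1) * math.log(t) - k * t)
    return math.exp(log_c)

def critical_tail(k):
    """Large-k form at the percolation point: (2 pi)^(-1/2) k^(-5/2)."""
    return (2 * math.pi) ** -0.5 * k ** -2.5

# At t = 1 the exact density approaches the algebraic tail.
ratio = component_density(10_000, 1.0) / critical_tail(10_000)

# Below percolation all nodes sit in finite components, so sum_k k c_k = 1.
mass = sum(k * component_density(k, 0.5) for k in range(1, 400))

print(ratio, mass)
```

The ratio differs from one only by the Stirling correction, of order $1/k$, while the mass sum converges quickly because the distribution decays exponentially away from the percolation point.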

Other statistical properties, including for example the moments, follow from the
generating function $c(z,t) = \sum_k k c_k(t) e^{kz}$, which can be written explicitly,

$$c(z,t) = t^{-1}\, G(z + \ln t - t), \tag{16}$$

in terms of the auxiliary generating function

$$G(z) = \sum_{k=1}^{\infty} \frac{k^{k-1}}{k!}\, e^{kz}. \tag{17}$$

This function satisfies $G e^{-G} = e^z$ or alternatively, $(1-G)\,dG/dz = G$.

Combining the latter relation and the behavior of the generating function near $z = 0$,
the second moment result (13) is generalized to all times, $M_2 = u/[t(1-u)]$ with
$u = G(\ln t - t)$. This quantity satisfies

$$u e^{-u} = t e^{-t}. \tag{18}$$

Let the fraction of nodes outside finite components be $g = 1 - M_1$ with $M_1 = c(z=0) = u/t$.
For $t < 1$, there is a single solution $u = t$ and therefore all nodes are in finite components,
$g = 0$. But when $t > 1$, there is an additional nontrivial solution, $u < 1$, and as a result, this
fraction becomes finite, $g > 0$. In particular, at the late stages of the process, the giant
component contains almost all nodes, $g(t) \to 1$, and furthermore, since $u \simeq t e^{-t}$ then
$1 - g \simeq c_1 = e^{-t}$, indicating that other than the giant component, there are only a few
isolated nodes. Also, just past the percolation point, the fraction of nodes outside finite
components grows linearly, $g(t) \simeq 2(t-1)$.
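The relation $u e^{-u} = t e^{-t}$ has no elementary closed-form solution for $t > 1$, but the root with $u < 1$ is easy to find numerically. The sketch below uses plain bisection, which is one convenient choice among many; the function names are hypothetical.

```python
import math

def finite_component_root(t, tol=1e-14):
    """Root u in [0, 1] of u*exp(-u) = t*exp(-t); equals t itself when t <= 1."""
    if t <= 1.0:
        return t
    target = t * math.exp(-t)
    lo, hi = 0.0, 1.0                 # u*exp(-u) increases monotonically on [0, 1]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def giant_fraction(t):
    """g = 1 - u/t, the fraction of nodes outside finite components."""
    if t <= 0.0:
        return 0.0
    return 1.0 - finite_component_root(t) / t

for t in (0.5, 1.01, 2.0, 5.0):
    print(t, giant_fraction(t))
```

Eliminating $u = t(1-g)$ from the relation gives the equivalent condition $1 - g = e^{-tg}$, which offers an independent check on the computed values.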



FINITE COMPONENTS: JOINT DISTRIBUTIONS 

We have seen that components undergo an aggregation process of one kind and that 
links, paths, and cycles undergo growth or aggregation processes of another kind. By 
combining these separate processes into a bi-aggregation process [24] involving two 
variables, say the component size and the node degree, a more detailed analysis of 
structural properties of finite components is possible. 



Links 

Each node can be characterized by two indices: its degree $l$ and the size $k$ of the
component it belongs to. The distribution $f_{l,k}$ of nodes of degree $l$ in components of size
$k$ satisfies

$$\frac{df_{l,k}}{dt} = \sum_{i+j=k} j c_j \big[(i-1) f_{l,i} + f_{l-1,i}\big] - k f_{l,k}. \tag{19}$$

The initial conditions are $f_{l,k}(0) = \delta_{l,0}\,\delta_{k,1}$. The first gain term accounts for linking
events that leave the node degree unchanged (the added link involves other nodes in
the component), while the second gain term represents linking events that augment the
node degree by one. Of course, the component size distribution and the node-degree
distribution can be obtained by summation of the joint distribution, so that Eqs. (1) and
(12) can be recovered from (19).

The joint distribution can be obtained analytically

$$f_{l,k}(t) = \frac{(k-1)^{k-l-2}}{(l-1)!\,(k-l-1)!}\, t^{k-1} e^{-kt} \tag{20}$$

for $1 \le l < k$ and $f_{0,k}(t) = \delta_{k,1}\, e^{-t}$ for $l = 0$. Fixing the node degree, the joint distribution
decays algebraically at large sizes at the critical point, $f_{l,k}(t=1) \sim k^{-3/2}$. Exponential
decay occurs elsewhere.

The generating function $f(z,w) = \sum_{l,k} e^{kz} w^l f_{l,k}$ is expressed directly in terms of
the auxiliary generating function (17),

$$f(z,w) = \exp\big[w\, G(z + \ln t - t) + z - t\big]. \tag{21}$$

Average quantities and correlations follow from the generating function, and for example,
the average node degree, the average component size, and the average correlation
between the two are

$$\langle l \rangle = u, \qquad \langle k \rangle = \frac{1}{1-u}, \qquad \langle kl \rangle = \frac{2u}{1-u}. \tag{22}$$

Below the percolation transition, the average node degree equals time, $\langle l \rangle = t$. Above
the percolation transition, the average node degree is reduced, $\langle l \rangle < t$, because nodes in
finite components have fewer connections than the rest of the nodes. The average degree
vanishes in the long time limit, $\langle l \rangle \simeq t e^{-t}$. In comparison, the fraction of isolated nodes
is $c_1 = e^{-t}$. Interestingly, the properly normalized correlation between the node degree
and the component size is time independent, $\langle kl \rangle / (\langle k \rangle \langle l \rangle) = 2$. The node degree and
the component size are correlated: nodes that have more links likely belong to larger
components.
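The averages quoted above can be checked by summing the joint distribution directly. The sketch below assumes the closed form $f_{l,k} = \frac{(k-1)^{k-l-2}}{(l-1)!\,(k-l-1)!}\, t^{k-1}e^{-kt}$ for $1 \le l < k$ together with $f_{0,1} = e^{-t}$, obtained here by expanding the generating function $\exp[wG(z+\ln t - t) + z - t]$; both this form and the function names are assumptions of the example rather than statements taken from the text.

```python
import math

def joint_degree_size(l, k, t):
    """Assumed closed form for f_{l,k}(t): degree-l nodes in size-k components (t > 0)."""
    if l == 0:
        return math.exp(-t) if k == 1 else 0.0   # isolated nodes only
    if l >= k:
        return 0.0                               # a tree node has degree at most k - 1
    log_f = ((k - l - 2) * math.log(k - 1) - math.lgamma(l) - math.lgamma(k - l)
             + (k - 1) * math.log(t) - k * t)
    return math.exp(log_f)

def joint_moments(t, k_max=200):
    """Return (norm, <l>, <k>, <kl>) over nodes in finite components up to size k_max."""
    norm = mean_l = mean_k = mean_kl = 0.0
    for k in range(1, k_max + 1):
        for l in range(0, k):
            f = joint_degree_size(l, k, t)
            norm += f
            mean_l += l * f
            mean_k += k * f
            mean_kl += k * l * f
    return norm, mean_l / norm, mean_k / norm, mean_kl / norm

norm, avg_l, avg_k, avg_kl = joint_moments(0.5)
print(norm, avg_l, avg_k, avg_kl / (avg_k * avg_l))
```

At $t = 0.5$ one has $u = t$, so the expected values are $\langle l \rangle = 0.5$, $\langle k \rangle = 2$, and a normalized correlation of 2.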



Paths 

Since every two nodes in a component are connected, there must be a path connecting
them. Let $p_{l,k}$ be the density of paths of length $l$ in components of size $k$. There is the
obvious bound $0 \le l \le k-1$ and additionally, there is a sum rule $\sum_l p_{l,k} = k^2 c_k$ reflecting
that there are $k^2$ distinct paths in a component of size $k$. The density of linkless paths is
$p_{0,k} = k c_k$.

The path length and the component size separately undergo an aggregation process
and combining the two processes, these two indices undergo a bi-aggregation process.
The joint distribution evolves according to the rate equation

$$\frac{dp_{l,k}}{dt} = \sum_{\substack{i+j=k \\ n+m=l-1}} p_{n,i}\, p_{m,j} + \sum_{i+j=k} (i p_{l,i})(j c_j) - k p_{l,k}. \tag{23}$$

The initial conditions are $p_{l,k}(0) = \delta_{k,1}\,\delta_{l,0}$. There are two separate convolutions: one
over the path length and one over the component size. The first term on the right-hand
side of Eq. (23) describes newly formed paths due to linking and the last two terms
correspond to paths that do not contain the newly placed link.
The path length density is

$$p_{l,k} = (l+1)\, \frac{k^{k-l-2}}{(k-l-1)!}\, t^{k-1} e^{-kt}. \tag{24}$$

The two shortest paths satisfy $p_{0,k} = k c_k$ and $p_{1,k} = 2(k-1) c_k$. The latter reflects
that there are $k-1$ links in a (tree) component of size $k$. Also, the longest possible
path, $l = k-1$, corresponds to a linear (chain-like) component, and the density of
such components, $p_{k-1,k} = t^{k-1} e^{-kt}$, decays exponentially with length, so that such
components are typically small.

The path length density can be simplified in the limit $k \gg l \gg 1$,

$$p_{l,k} \simeq l\, (2\pi k^3)^{-1/2}\, t^{k-1} e^{k(1-t)}\, e^{-l^2/2k}. \tag{25}$$

As was the case for the component size distribution, the path length density is
self-similar in the vicinity of the percolation point, $p_{l,k} \to (1-t)^2\, \Phi_p\big(k(1-t)^2, l(1-t)\big)$,
with the scaling function

$$\Phi_p(x,y) = y\, (2\pi x^3)^{-1/2} \exp\!\left(-\frac{x}{2} - \frac{y^2}{2x}\right). \tag{26}$$

The factor $\exp(-x/2)$ is the near-critical limit of $t^{k-1} e^{k(1-t)}$ in (25), exactly as in the
scaling function of the component size distribution. The characteristic path length is as in (5)
and the characteristic component size is as in (15). At the percolation point, the path length
density (25) is governed by the factor $\exp(-l^2/2k)$ and therefore, the typical path length
scales as the square root of the component size,

$$l \sim k^{1/2}. \tag{27}$$

The generating function $p(z,w) = \sum_{l,k} e^{kz} w^l p_{l,k}$ is expressed in terms of the auxiliary
function (17),

$$p(z,w) = t^{-1}\, \frac{G(z + \ln t - t)}{1 - w\, G(z + \ln t - t)}. \tag{28}$$

The total density of paths in finite components, $p_{tot} = \sum_{l,k} p_{l,k}$, is therefore
$p_{tot} = u/[t(1-u)]$ and for $t < 1$ we recover $p_{tot} = 1/(1-t)$. Expanding $p(z,w)$ in
powers of $w$, the total number of paths of length $l$, $p_l = \sum_k p_{l,k}$, is given by $p_l = t^{-1} u^{l+1}$
with $u$ satisfying (18), in accord with (4) for $t < 1$.
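Two consistency checks on the joint path distribution are the sum rule over path lengths, $\sum_l p_{l,k} = k^2 c_k$, and the reduction $\sum_k p_{l,k} = t^l$ below the percolation time. The sketch below assumes the closed form $p_{l,k} = (l+1)\frac{k^{k-l-2}}{(k-l-1)!} t^{k-1}e^{-kt}$, obtained here by expanding the generating function in powers of $w$; this form and the function names are assumptions of the example.

```python
import math

def component_density(k, t):
    """c_k(t) = k^(k-2)/k! * t^(k-1) * exp(-k t), via logarithms (t > 0)."""
    return math.exp((k - 2) * math.log(k) - math.lgamma(k + 1)
                    + (k - 1) * math.log(t) - k * t)

def path_density(l, k, t):
    """Assumed closed form p_{l,k}(t), nonzero only for 0 <= l <= k - 1."""
    if l < 0 or l > k - 1:
        return 0.0
    log_p = (math.log(l + 1) + (k - l - 2) * math.log(k) - math.lgamma(k - l)
             + (k - 1) * math.log(t) - k * t)
    return math.exp(log_p)

def paths_in_size(k, t):
    """Sum over path lengths at fixed component size: should equal k^2 c_k."""
    return sum(path_density(l, k, t) for l in range(k))

def paths_of_length(l, t, k_max=400):
    """Sum over component sizes at fixed length: should equal t^l for t < 1."""
    return sum(path_density(l, k, t) for k in range(l + 1, k_max + 1))

t = 0.5
for k in (2, 5, 20):
    print(k, paths_in_size(k, t), k * k * component_density(k, t))
for l in (0, 1, 3):
    print(l, paths_of_length(l, t), t**l)
```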



Cycles 



We have seen that the cycle length distribution is coupled to the path length distribu- 
tion. In a similar way, the joint distribution of cycles in finite components of a given size 
is coupled to the joint distribution of paths of a given length in components of a given 
size. 

To characterize cycles in a given component size, we consider the joint distribution
$q_{l,k}$, the average number of components of size $k$ containing a cycle of length $l$ with
$1 \le l \le k$. This joint distribution evolves according to the linear rate equation

$$\frac{dq_{l,k}}{dt} = \frac{1}{2}\, p_{l-1,k} + \sum_{i+j=k} (i q_{l,i})(j c_j) - k q_{l,k} \tag{29}$$

for $l \ge 1$. Initially, there are no cycles, and therefore $q_{l,k}(0) = 0$. The first term on
the right-hand side represents generation of cycles from paths, and the next two terms
represent merger events where only the component size changes.
The joint cycle-length component-size distribution is

$$q_{l,k} = \frac{1}{2}\, \frac{k^{k-l-1}}{(k-l)!}\, t^{k} e^{-kt}. \tag{30}$$

The smallest cycle, $l = 1$, is a self-connection, and the average number of such cycles
is $q_{1,k} = \frac{1}{2}\, t\, k c_k$. The largest cycles are rings, $l = k$, and their total number is on average
$q_{k,k} = \frac{1}{2k}\, t^k e^{-kt}$. As for linear chains, the number of rings decays exponentially with
length.

The large-$k$ behavior of the cycle length distribution is similar to (25),

$$q_{l,k}(t) \simeq (8\pi k^3)^{-1/2}\, t^k e^{k(1-t)}\, e^{-l^2/2k}. \tag{31}$$

This distribution is self-similar in the vicinity of the percolation transition,
$q_{l,k}(t) \to (1-t)^3\, \Phi_q\big(k(1-t)^2, l(1-t)\big)$, with the scaling function
$\Phi_q(x,y) = (8\pi x^3)^{-1/2} \exp(-x/2 - y^2/2x)$. We see that the cycle length is characterized by
the same scale as the path length, $l \sim (1-t)^{-1}$. At the percolation point, the cycle length
distribution (31) is dominated by the factor $\exp(-l^2/2k)$ so that when the component size
is fixed, the typical cycle length behaves as the typical path length, $l \sim k^{1/2}$. Moreover,
the size distribution of finite components containing a cycle, $q_k = \sum_l q_{l,k}$, decays as a
power law at the percolation point, $q_k \simeq (4k)^{-1}$.

The joint generating function, $q(z,w) = \sum_{l,k} e^{kz} w^l q_{l,k}$, is

$$q(z,w) = \frac{1}{2} \ln \frac{1}{1 - w\, G(z + \ln t - t)}. \tag{32}$$

As for paths, statistics of cycles are directly coupled to statistics of components via
the generating function $G(z)$. The total number of cycle-containing components of
finite size, $q_{tot} = \sum_{l,k} q_{l,k}$, is therefore $q_{tot}(t) = \frac{1}{2}\ln\frac{1}{1-u}$. Below the percolation point,
$q_{tot}(t) = \frac{1}{2}\ln\frac{1}{1-t}$ for $t < 1$. Moreover, expanding $q(z,w)$ in powers of $w$ shows that
the cycle length distribution (in finite components only) is $q_l = u^l/(2l)$, in agreement with
(7) prior to the percolation time ($t < 1$).
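The joint cycle distribution can be checked in the same spirit: summing over component sizes should reproduce $t^l/(2l)$ below the percolation time, and summing over cycle lengths at $t = 1$ should approach $(4k)^{-1}$. The sketch below assumes the closed form $q_{l,k} = \frac{1}{2}\frac{k^{k-l-1}}{(k-l)!} t^k e^{-kt}$, obtained here by expanding the logarithmic generating function in powers of $w$; this form and the function names are assumptions of the example.

```python
import math

def cycle_density(l, k, t):
    """Assumed closed form q_{l,k}(t), nonzero only for 1 <= l <= k (t > 0)."""
    if l < 1 or l > k:
        return 0.0
    log_q = ((k - l - 1) * math.log(k) - math.lgamma(k - l + 1)
             + k * math.log(t) - k * t)
    return 0.5 * math.exp(log_q)

def cycles_of_length(l, t, k_max=400):
    """Sum over component sizes: should equal t^l / (2 l) for t < 1."""
    return sum(cycle_density(l, k, t) for k in range(l, k_max + 1))

def cyclic_components_of_size(k, t=1.0):
    """Sum over cycle lengths: approaches 1/(4k) at the percolation point."""
    return sum(cycle_density(l, k, t) for l in range(1, k + 1))

for l in (1, 2, 5):
    print(l, cycles_of_length(l, 0.5), 0.5**l / (2 * l))

k = 2000
print(4 * k * cyclic_components_of_size(k))   # slowly approaches 1 from below
```

The approach to $(4k)^{-1}$ is slow, with a relative correction of order $k^{-1/2}$, so large sizes are needed to see the asymptotic law.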

FINITE GRAPHS 

Thus far, we used rate equations to describe infinite systems. While the rate equation 
approach can be extended to finite systems, the resulting equations are difficult to handle 
[25, 26]. When the number of nodes is finite, fluctuations are no longer negligible, and 
instead of a deterministic rate equation approach, a stochastic approach is needed. Finite- 
size scaling laws can be conveniently obtained by combining the exact infinite system 
results with scaling and extreme statistics arguments. 

Finite-size scaling laws of random graphs are quite interesting. For example, the giant
component nucleates at a size that is much smaller than the system size. The size of
the largest component in the system, $M$, can be estimated by employing the cumulative
component size distribution and the extreme statistics criterion, $N \sum_{k \ge M} c_k(t=1) \sim 1$.
Using $c_k \sim k^{-5/2}$ gives

$$M \sim N^{2/3}. \tag{33}$$

The largest component in the system grows sub-linearly with the system size [3]. This
component nucleates very close to the percolation time. The time $\tau$ when this component
emerges approaches unity for large enough systems as implied by the diverging
characteristic size scale $M \sim (1-t)^{-2}$, so that

$$1 - \tau \sim N^{-1/3}. \tag{34}$$

Just past the percolation time, the size of the giant component grows linearly with time. 

For finite systems, the scaling law for the typical path length (5), combined with the
characteristic component size (33), yields a scaling law for the characteristic path (and
cycle) length,

$$l \sim N^{1/3}. \tag{35}$$

One can deduce several other scaling laws and finite-size scaling functions underlying
the path length density. For example, substituting the percolation time (34) into the total
number of paths $P_{tot} = (1-t)^{-1}$ yields the total path density $P_{tot} \sim N^{1/3}$. Similarly, the
total number of cycles at the percolation point grows logarithmically with the system
size, $Q_{tot}(N) \simeq \frac{1}{6} \ln N$.

In finite systems, it is possible that no cycles are created by the percolation time. This
probability decreases algebraically with the system size, as seen from (9) and (34),

$$S \sim N^{-1/6}. \tag{36}$$

Moreover, combining the size distribution of the first cycle, $G_l(1) \sim l^{-3/2}$, with the
characteristic cycle scale $l \sim N^{1/3}$ yields the moments of the size distribution corresponding
to the first cycle,

$$\langle l^n \rangle \sim N^{n/3 - 1/6}. \tag{37}$$

In particular, the average size of the first cycle is much smaller than the characteristic
cycle length, $\langle l \rangle \sim N^{1/6}$.
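The finite-size law $M \sim N^{2/3}$ for the largest component can be explored with a direct simulation. The sketch below is an illustration under stated assumptions, not the procedure used to derive the scaling laws: it places exactly `round(N*t/2)` links, tracks components with a union-find structure, and uses hypothetical function names. A single run fluctuates strongly at the percolation point, so only the order of magnitude is meaningful.

```python
import random

def largest_component_at(t, n_nodes, seed=0):
    """Size of the largest component after about n_nodes*t/2 random links."""
    rng = random.Random(seed)
    parent = list(range(n_nodes))
    size = [1] * n_nodes

    def find(x):
        # Follow parent pointers to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    largest = 1
    for _ in range(round(n_nodes * t / 2)):
        a = find(rng.randrange(n_nodes))
        b = find(rng.randrange(n_nodes))
        if a == b:
            continue                  # link inside one component: no merger
        if size[a] < size[b]:
            a, b = b, a               # attach the smaller tree to the larger
        parent[b] = a
        size[a] += size[b]
        largest = max(largest, size[a])
    return largest

for n in (1_000, 8_000, 64_000):
    m = largest_component_at(1.0, n)
    print(n, m, round(m / n ** (2 / 3), 3))   # last column should stay of order one
```

Averaging over many seeds would be needed to extract the exponent reliably.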



SUMMARY 



In summary, we used kinetic theory to describe structural properties of random graphs 
including paths, cycles, and components. Modeling the linking process dynamically 
shows that paths and components undergo separate aggregation processes. Cycles are 
generated by paths and thus, the cycle length distribution is coupled to the path length 
distribution. 

Generally, size distributions decay exponentially away from the percolation point, but 
at the percolation point, algebraic tails emerge. As the system approaches this critical 
point, the size distributions follow a self-similar behavior and they are characterized by 
diverging size scales. 

The kinetic theory approach is well-suited for treating infinite systems. Nevertheless, 
the behavior of finite systems can be obtained from heuristic scaling and extreme 
statistics arguments. This yields scaling laws for the typical component size, path length, 
and cycle length at the percolation point. 

The rate equation approach is powerful in that it utilizes a continuous time variable 
and therefore, differential, rather than difference equations. It has been successfully used 
to model growing random networks and it should be applicable to more complex random 
structures. 